In this part of my project I will refine my research questions. I will further examine the effects of the pandemic on recent MCPS high school graduates enrolled at Montgomery College. For the purposes of this study I will limit my dataset to MCPS students aged 20 or younger. These students will be further divided into subgroups based on gender and race. The datasets used in this part of my project were already cleaned in my initial data analysis.
For the purposes of this project, the following variables and definitions are important.
Terminology:
Fall2019 refers to the incoming freshman cohort in Fall2019. This is term year 2020.
Fall2020 refers to the incoming freshman cohort in Fall2020. This is term year 2021.
Variables of Interest:
term_year: incoming students in Fall 2019 are assigned to term year 2020; incoming students in Fall 2020 are assigned to term year 2021.
hours_earned: credit hours the student earned in their first Fall semester. This can include credits earned in the second summer session (Summer 2) and AP credits earned in high school.
hours_attempted: credit and non-credit hours the student attempted in their first Fall semester. This may include credits attempted in the second summer session (Summer 2).
full_part: whether the student is full-time (FT) or part-time (PT). This classification is based on the student's self-reported information in the admissions application. Students are classified as full-time if they intend to take at least 12 credits.
major: degree program the student is registered for, or "certificate & LR" (letter of recommendation). All certificates and letters of recommendation have been grouped together.
hours_earned_rate: Ratio of hours_earned/hours_attempted
age: age of student at the start of program.
race: racial classification of student. This is based on the IPEDS system. Foreign students are identified as foreign and not by their race/ethnicity.
sex: gender classification of student.
high_school: name of the high school the student graduated from. Public high schools in Montgomery County are classified as MCPS.
pell_grant: whether the student received a Pell grant.
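As a minimal sketch of how hours_earned_rate could be derived, assuming it is simply hours_earned divided by hours_attempted as defined above. The toy values are made up and are not taken from df_Degrees. The last row illustrates one possible reason the rate can exceed 1: credits (for example AP credits) that count toward hours_earned without a matching attempt in the Fall term.

```r
# Toy data (hypothetical values, not from df_Degrees)
toy <- data.frame(
  hours_attempted = c(12, 15, 9, 5),
  hours_earned    = c(12, 9, 0, 16)   # last student: earned exceeds attempted
)

# Ratio of credit hours earned to credit hours attempted
toy$hours_earned_rate <- toy$hours_earned / toy$hours_attempted

# Flag rows where the rate is above 1 so they can be inspected separately
toy$rate_above_1 <- toy$hours_earned_rate > 1

print(toy)
```

In the summary below, hours_attempted has a minimum of 1, so the division is defined for every row of df_Degrees.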
Summary of Data and Types
skim(df_Degrees)
| Name | df_Degrees |
| Number of rows | 7123 |
| Number of columns | 23 |
| _______________________ | |
| Column type frequency: | |
| character | 15 |
| logical | 1 |
| numeric | 7 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| sex | 0 | 1.00 | 1 | 1 | 0 | 4 | 0 |
| race | 0 | 1.00 | 5 | 22 | 0 | 9 | 0 |
| age | 0 | 1.00 | 4 | 7 | 0 | 5 | 0 |
| high_school | 0 | 1.00 | 7 | 30 | 0 | 163 | 0 |
| full_part | 0 | 1.00 | 2 | 2 | 0 | 2 | 0 |
| city | 19 | 1.00 | 5 | 19 | 0 | 127 | 0 |
| stat_code | 19 | 1.00 | 2 | 2 | 0 | 16 | 0 |
| pell_grant | 0 | 1.00 | 1 | 1 | 0 | 2 | 0 |
| camp_code | 140 | 0.98 | 1 | 1 | 0 | 6 | 0 |
| major | 0 | 1.00 | 1 | 61 | 0 | 34 | 0 |
| pass_engl | 0 | 1.00 | 1 | 1 | 0 | 2 | 0 |
| pass_math | 0 | 1.00 | 1 | 1 | 0 | 2 | 0 |
| summer2 | 0 | 1.00 | 1 | 1 | 0 | 1 | 0 |
| fall | 0 | 1.00 | 1 | 1 | 0 | 1 | 0 |
| HS_classify | 0 | 1.00 | 2 | 14 | 0 | 7 | 0 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| MCPS | 0 | 1 | 0.7 | TRU: 4963, FAL: 2160 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| u_number | 0 | 1 | 20196625.60 | 5027.06 | 20190001 | 20191872.50 | 20193733.00 | 20201703.5 | 20203588.0 | ▇▃▁▂▇ |
| zip | 19 | 1 | 20886.64 | 1559.40 | 1460 | 20853.00 | 20877.00 | 20903.0 | 94025.0 | ▁▇▁▁▁ |
| hours_attempted | 0 | 1 | 12.46 | 6.23 | 1 | 9.00 | 12.00 | 15.0 | 54.0 | ▆▇▁▁▁ |
| hours_earned | 0 | 1 | 7.85 | 7.43 | 0 | 3.00 | 6.00 | 12.0 | 54.0 | ▇▃▁▁▁ |
| mc_gpa | 0 | 1 | 2.19 | 1.47 | 0 | 0.67 | 2.50 | 3.5 | 4.0 | ▆▂▃▅▇ |
| term_year | 0 | 1 | 2020.47 | 0.50 | 2020 | 2020.00 | 2020.00 | 2021.0 | 2021.0 | ▇▁▁▁▇ |
| hours_earned_rate | 0 | 1 | 0.57 | 0.38 | 0 | 0.23 | 0.64 | 1.0 | 3.2 | ▇▇▁▁▁ |
Change Datatypes
df_Degrees$u_number<- as.character(df_Degrees$u_number)
df_Degrees$term_year<- as.character(df_Degrees$term_year)
Use the dataframe df_Degrees, which was cleaned in the initial data analysis. Keep only MCPS students aged 20 or younger.
df_MCPS20D <- df_Degrees %>%
  filter(HS_classify == "MCPS") %>%          # keep students who graduated from MCPS high schools
  filter(age == "18 - 20" | age == "< 18")   # keep students aged 20 or younger
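Because age is stored as a text label rather than a number, the filter relies on exact string matches. The sketch below shows the same selection in base R on made-up rows, using %in% so the accepted labels live in one place. The labels "< 18" and "18 - 20" come from the filter above; "Private" and "21 - 24" are hypothetical values used only for illustration.

```r
# Toy rows (hypothetical); only the two "young" age labels come from the report
toy <- data.frame(
  HS_classify = c("MCPS", "MCPS", "Private", "MCPS"),
  age         = c("< 18", "18 - 20", "18 - 20", "21 - 24"),
  stringsAsFactors = FALSE
)

# Age labels that count as "20 or younger"
young_levels <- c("< 18", "18 - 20")

# Keep MCPS graduates whose age label is in the accepted set
toy_mcps20 <- toy[toy$HS_classify == "MCPS" & toy$age %in% young_levels, ]

# Quick check of which age labels survived the filter
table(toy_mcps20$age)
```

Tabulating the age column after filtering is a cheap way to catch a label that differs only in spacing.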
Frequency of Students Part time versus Full time: 2020 vs 2021
# Number of part time and full time students, 2020 vs 2021
ggplot(data=df_MCPS20D, aes(x=full_part, fill=full_part)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=2,size=3)+
facet_wrap(~term_year)+
ggtitle("Number of Students Full time versus Part time")+
ylab('Frequency')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
Proportion of Students Full time versus Part time: 2020 vs 2021
df_MCPS20D %>%
group_by(term_year) %>%
count(full_part) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = full_part, y = prop)) +
geom_col(aes(fill = full_part), position = "dodge") +
geom_text(aes(label = scales::percent(prop),
y = prop,
group = full_part),
position = position_dodge(width = 0.9),
vjust = 2,size=3)+
facet_wrap(~term_year)+
ggtitle("Proportion of Students Full time versus Part time")+
ylab('Percentage')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
# change in overall MCPS student population from 2020 to 2021
df_MCPS20D%>%
group_by(term_year,full_part)%>%
count(full_part)%>%
group_by(term_year)%>%
mutate(total_pop =sum(n))%>%
group_by(full_part)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 4 x 5
## # Groups: full_part [2]
## term_year full_part n total_pop pct_change
## <chr> <chr> <int> <int> <dbl>
## 1 2020 FT 1655 2456 NA
## 2 2021 FT 1556 2303 -5.98
## 3 2020 PT 801 2456 NA
## 4 2021 PT 747 2303 -6.74
Among students who graduated from MCPS high schools, full time enrollment decreased by 5.98% and part time enrollment decreased by 6.74% from term year 2020 to term year 2021.
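The percentage changes above can be re-derived directly from the four counts in the table, which is a useful sanity check on the lag() pipeline. The sketch below uses base R and only the counts reported above; the variable names are my own. It also computes the full time share within each term year.

```r
# Counts copied from the table above
counts <- data.frame(
  term_year = c("2020", "2021", "2020", "2021"),
  full_part = c("FT", "FT", "PT", "PT"),
  n         = c(1655, 1556, 801, 747)
)

# Percentage change from 2020 to 2021 within each enrollment status
n2020 <- counts$n[counts$term_year == "2020"]   # order: FT, PT
n2021 <- counts$n[counts$term_year == "2021"]   # order: FT, PT
pct_change <- (n2021 - n2020) / n2020 * 100
names(pct_change) <- c("FT", "PT")
round(pct_change, 2)

# Share of full time students within each term year, in percent
ft_share <- c("2020" = 1655 / (1655 + 801),
              "2021" = 1556 / (1556 + 747)) * 100
round(ft_share, 1)
```

Both groups shrank by a similar percentage, so the full time share stays at roughly 67% in both term years.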
Count of Race Groups
ggplot(data=df_MCPS20D, aes(x=race, fill=race)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0,size=3)+
facet_wrap(~term_year + full_part)+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())+
ggtitle("Number of Students per Race Group")+
xlab("Race")+
ylab("Frequency")
Full time student: Change in enrollment from 2020 to 2021 based on Race
# calculate percentage change in full time student enrollment from 2020 to 2021 by race
df_MCPS20D%>%
filter(full_part=="FT")%>%
group_by(term_year,race)%>%
count(race)%>%
group_by(race)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 18 x 4
## # Groups: race [9]
## term_year race n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 Am. Indian / AK Native 5 NA
## 2 2021 Am. Indian / AK Native 1 -80
## 3 2020 Asian 272 NA
## 4 2021 Asian 227 -16.5
## 5 2020 Black / African Am. 389 NA
## 6 2021 Black / African Am. 326 -16.2
## 7 2020 Foreign 103 NA
## 8 2021 Foreign 96 -6.80
## 9 2020 Hawaiian / Pac. Isl. 5 NA
## 10 2021 Hawaiian / Pac. Isl. 3 -40
## 11 2020 Hispanic 534 NA
## 12 2021 Hispanic 596 11.6
## 13 2020 Multi-Race 71 NA
## 14 2021 Multi-Race 63 -11.3
## 15 2020 Unknown 11 NA
## 16 2021 Unknown 3 -72.7
## 17 2020 White 265 NA
## 18 2021 White 241 -9.06
Full time students: There was a 16.5% decline in Asian students, a 16.2% decline in African American students, a 9.1% decline in White students and a 6.8% decline in foreign students. Hispanic students increased by 11.6%.
Part time student: Change in enrollment from 2020 to 2021 based on Race
# calculate percentage change in part time student enrollment from 2020 to 2021 by race
df_MCPS20D%>%
filter(full_part=="PT")%>%
group_by(term_year,race)%>%
count(race)%>%
group_by(race)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 18 x 4
## # Groups: race [9]
## term_year race n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 Am. Indian / AK Native 4 NA
## 2 2021 Am. Indian / AK Native 1 -75
## 3 2020 Asian 69 NA
## 4 2021 Asian 63 -8.70
## 5 2020 Black / African Am. 177 NA
## 6 2021 Black / African Am. 181 2.26
## 7 2020 Foreign 73 NA
## 8 2021 Foreign 54 -26.0
## 9 2020 Hawaiian / Pac. Isl. 1 NA
## 10 2021 Hawaiian / Pac. Isl. 1 0
## 11 2020 Hispanic 327 NA
## 12 2021 Hispanic 263 -19.6
## 13 2020 Multi-Race 33 NA
## 14 2021 Multi-Race 35 6.06
## 15 2020 Unknown 5 NA
## 16 2021 Unknown 2 -60
## 17 2020 White 112 NA
## 18 2021 White 147 31.2
Part time students: There was an 8.7% decrease in Asian students, a 26% decrease in foreign students, a 2.3% increase in African American students and a 19.6% decrease in Hispanic students. There was a 31.25% increase in White students.
Gender of Students
# Gender of students part time and full time 2020 vs 2021
ggplot(data=df_MCPS20D, aes(x=sex, fill=sex)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=1,size=3)+
facet_wrap(~term_year+full_part)+
ggtitle("Gender of Students: Full time versus Part time")+
ylab('Frequency')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
Calculate percentage change in full time student enrollment from 2020 to 2021 by gender
# calculate percentage change in full time student enrollment from 2020 to 2021 by gender
df_MCPS20D%>%
filter(full_part=="FT")%>%
filter(sex=="F"|sex =="M")%>%
group_by(term_year,sex)%>%
count(sex)%>%
group_by(sex)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 4 x 4
## # Groups: sex [2]
## term_year sex n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 F 793 NA
## 2 2021 F 819 3.28
## 3 2020 M 842 NA
## 4 2021 M 719 -14.6
Full time students: There was a 14.6% decrease in male students and a 3.28% increase in female students.
Calculate percentage change in part time student enrollment from 2020 to 2021 by gender
# calculate percentage change in part time student enrollment from 2020 to 2021 by gender
df_MCPS20D%>%
filter(full_part=="PT")%>%
filter(sex=="F"|sex =="M")%>%
group_by(term_year,sex)%>%
count(sex)%>%
group_by(sex)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 4 x 4
## # Groups: sex [2]
## term_year sex n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 F 381 NA
## 2 2021 F 345 -9.45
## 3 2020 M 401 NA
## 4 2021 M 395 -1.50
Part time students: There was a 9.5% decrease in female students and a 1.5% decrease in male students.
Gender and Race breakdown of full time students
# Gender and Race of full time students 2020 vs 2021
df_MCPS20D%>%
filter(sex %in% c("F","M"))%>%
filter(full_part=="FT")%>%
ggplot(., aes(x=race, fill=race)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0, size=3)+
facet_wrap(~term_year+sex)+
ggtitle("Gender and Race of Full time Students")+
ylab('Frequency')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
Full time student enrollment: percentage change by gender and race
# calculate percentage change in full time student enrollment from 2020 to 2021 by race and gender
# count full time students by race and gender, then compare the two term years
df_MCPS20D%>%
filter(full_part=="FT")%>%
filter(sex=="F"|sex =="M")%>%
group_by(term_year,race,sex)%>%
count(sex)%>%
group_by(race,sex)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 35 x 5
## # Groups: race, sex [18]
## term_year race sex n pct_change
## <chr> <chr> <chr> <int> <dbl>
## 1 2020 Am. Indian / AK Native F 4 NA
## 2 2020 Am. Indian / AK Native M 1 NA
## 3 2021 Am. Indian / AK Native M 1 0
## 4 2020 Asian F 111 NA
## 5 2021 Asian F 115 3.60
## 6 2020 Asian M 159 NA
## 7 2021 Asian M 110 -30.8
## 8 2020 Black / African Am. F 178 NA
## 9 2021 Black / African Am. F 169 -5.06
## 10 2020 Black / African Am. M 202 NA
## # … with 25 more rows
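In the output above, American Indian / Alaska Native female students appear in 2020 (n = 4) but have no 2021 row, so lag() never sees a 2021 count and the drop to zero goes unreported. A minimal base R sketch of one way to keep such empty combinations, using made-up records: table() returns a count for every group and year pairing, including zeros. The group names "A" and "B" are hypothetical.

```r
# Toy records (hypothetical): group "B" has students in 2020 but none in 2021
toy <- data.frame(
  term_year = c("2020", "2020", "2020", "2020", "2021", "2021"),
  group     = c("A", "A", "B", "B", "A", "A"),
  stringsAsFactors = FALSE
)

# table() keeps every group x year combination, including zero counts
counts <- table(toy$group, toy$term_year)

# Percentage change from 2020 to 2021 per group
pct_change <- (counts[, "2021"] - counts[, "2020"]) / counts[, "2020"] * 100
pct_change
```

A group with zero students in 2020 would produce Inf or NaN here, so such rows would need separate handling.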
Part time student enrollment: percentage change by gender and race
# calculate percentage change in part time student enrollment from 2020 to 2021 by race and gender
# count part time students by race and gender, then compare the two term years
df_MCPS20D%>%
filter(full_part=="PT")%>%
filter(sex=="F"|sex =="M")%>%
group_by(term_year,race,sex)%>%
count(sex)%>%
group_by(race,sex)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 31 x 5
## # Groups: race, sex [17]
## term_year race sex n pct_change
## <chr> <chr> <chr> <int> <dbl>
## 1 2020 Am. Indian / AK Native M 4 NA
## 2 2021 Am. Indian / AK Native M 1 -75
## 3 2020 Asian F 30 NA
## 4 2021 Asian F 19 -36.7
## 5 2020 Asian M 37 NA
## 6 2021 Asian M 44 18.9
## 7 2020 Black / African Am. F 79 NA
## 8 2021 Black / African Am. F 84 6.33
## 9 2020 Black / African Am. M 96 NA
## 10 2021 Black / African Am. M 94 -2.08
## # … with 21 more rows
Need to correct file
# Pell Grant
ggplot(data=df_MCPS20D, aes(x=pell_grant, fill=pell_grant)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=3, size=3)+
facet_wrap(~term_year+full_part)+
ggtitle("Pell grant")+
ylab('Frequency')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
Proportion of Students receiving Pell Grants by term year and enrollment status
df_MCPS20D %>%
group_by(term_year,full_part) %>%
count(pell_grant) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = pell_grant, y = prop)) +
geom_col(aes(fill = pell_grant), position = "dodge") +
geom_text(aes(label = scales::percent(prop,0.1),
y = prop,
group = pell_grant),
position = position_dodge(width = 0.9),
vjust = 0,size=3)+
facet_wrap(~term_year + full_part)+
ggtitle("Proportion of Students receiving Pell Grants")+
ylab('Proportion ')+
xlab("")+
theme(axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
Overall Majors trend
Count of Majors in Full time students in 2020
z1<- df_MCPS20D%>%
filter(full_part=="FT" &term_year =="2020")%>%
ggplot(., aes(x=major, fill=major)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0, hjust=0, size =3)+
ggtitle("Majors of Full-time Students in 2020 ")+
xlab("Major")+
ylab("Frequency")+
theme(legend.position = "none")
z1 + coord_flip()
Count of Majors in Full time students in 2021
z13<- df_MCPS20D%>%
filter(full_part=="FT" &term_year =="2021")%>%
ggplot(., aes(x=major, fill=major)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0, hjust=0, size =3)+
ggtitle("Majors of Full-time Students in 2021 ")+
xlab("Major")+
ylab("Frequency")+
theme(legend.position = "none")
z13 + coord_flip()
Percentage change in full time student majors from 2020 to 2021
df_MCPS20D%>%
filter(full_part=="FT")%>%
group_by(term_year,major)%>%
count(major)%>%
group_by(major)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 62 x 4
## # Groups: major [33]
## term_year major n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 0 3 NA
## 2 2021 0 2 -33.3
## 3 2020 American Sign Language 5 NA
## 4 2021 American Sign Language 1 -80
## 5 2020 Applied Geography 1 NA
## 6 2021 Applied Geography 2 100
## 7 2020 Architectural Technology 15 NA
## 8 2021 Architectural Technology 19 26.7
## 9 2020 Art 24 NA
## 10 2021 Art 22 -8.33
## # … with 52 more rows
Count of Majors in Part time students in 2020
z11<- df_MCPS20D%>%
filter(full_part=="PT" &term_year =="2020")%>%
ggplot(., aes(x=major, fill=major)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0, hjust=0, size =3)+
ggtitle("Majors of Part-time Students in 2020 ")+
xlab("Major")+
ylab("Frequency")+
theme(legend.position = "none")
z11 + coord_flip()
Count of Majors in Part time students in 2021
z12<- df_MCPS20D%>%
filter(full_part=="PT" &term_year =="2021")%>%
ggplot(., aes(x=major, fill=major)) +
geom_bar() +
geom_text(stat='count', aes(label=..count..), vjust=0, hjust=0, size =3)+
ggtitle("Majors of Part-time Students in 2021 ")+
xlab("Major")+
ylab("Frequency")+
theme(legend.position = "none")
z12 + coord_flip()
Percentage change in part time student majors from 2020 to 2021
df_MCPS20D%>%
filter(full_part=="PT")%>%
group_by(term_year,major)%>%
count(major)%>%
group_by(major)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)
## # A tibble: 60 x 4
## # Groups: major [32]
## term_year major n pct_change
## <chr> <chr> <int> <dbl>
## 1 2020 0 5 NA
## 2 2020 American Sign Language 1 NA
## 3 2021 American Sign Language 2 100
## 4 2020 Applied Geography 2 NA
## 5 2020 Architectural Technology 13 NA
## 6 2021 Architectural Technology 4 -69.2
## 7 2020 Art 12 NA
## 8 2021 Art 14 16.7
## 9 2020 Broadcast Media 5 NA
## 10 2021 Broadcast Media 4 -20
## # … with 50 more rows
MCPS high schools attended by full time students in term year 2020
df_MCPS20D%>%
filter(full_part=="FT" & term_year=="2020")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(term_year)%>%
mutate(total_pop =sum(n))%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_pop= (n/total_pop*100))%>%
arrange(desc(pct_pop))
## # A tibble: 25 x 5
## # Groups: high_school [25]
## term_year high_school n total_pop pct_pop
## <chr> <chr> <int> <int> <dbl>
## 1 2020 Gaithersburg High School 132 1655 7.98
## 2 2020 Montgomery Blair High School 105 1655 6.34
## 3 2020 Northwest HS - Germantown 92 1655 5.56
## 4 2020 Paint Branch High School 92 1655 5.56
## 5 2020 Springbrook Sr High School 91 1655 5.50
## 6 2020 Wheaton High School 80 1655 4.83
## 7 2020 Clarksburg High School 76 1655 4.59
## 8 2020 Richard Montgomery High School 76 1655 4.59
## 9 2020 Colonel Zadok Magruder HS 75 1655 4.53
## 10 2020 Albert Einstein HS & MC Art Cn 74 1655 4.47
## # … with 15 more rows
MCPS high schools attended by full time students in term year 2021
df_MCPS20D%>%
filter(full_part=="FT" & term_year=="2021")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(term_year)%>%
mutate(total_pop =sum(n))%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_pop= (n/total_pop*100))%>%
arrange(desc(pct_pop))
## # A tibble: 25 x 5
## # Groups: high_school [25]
## term_year high_school n total_pop pct_pop
## <chr> <chr> <int> <int> <dbl>
## 1 2021 Montgomery Blair High School 97 1556 6.23
## 2 2021 Paint Branch High School 91 1556 5.85
## 3 2021 Wheaton High School 90 1556 5.78
## 4 2021 Gaithersburg High School 89 1556 5.72
## 5 2021 Northwest HS - Germantown 86 1556 5.53
## 6 2021 Colonel Zadok Magruder HS 84 1556 5.40
## 7 2021 Richard Montgomery High School 78 1556 5.01
## 8 2021 Watkins Mill High School 75 1556 4.82
## 9 2021 Clarksburg High School 74 1556 4.76
## 10 2021 James Hubert Blake High School 70 1556 4.50
## # … with 15 more rows
# calculate percentage change in full time student enrollment from 2020 to 2021 by MCPS highschool
df_MCPS20D%>%
filter(full_part=="FT")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)%>%
arrange(desc(pct_change))
## # A tibble: 50 x 4
## # Groups: high_school [25]
## term_year high_school n pct_change
## <chr> <chr> <int> <dbl>
## 1 2021 Rockville High School 65 41.3
## 2 2021 Wheaton High School 90 12.5
## 3 2021 Colonel Zadok Magruder HS 84 12
## 4 2021 Walt Whitman High School 20 11.1
## 5 2021 Seneca Valley High School 55 10
## 6 2021 Sherwood High School 68 9.68
## 7 2021 Bethesda Chevy Chase High Schl 43 4.88
## 8 2021 Watkins Mill High School 75 4.17
## 9 2021 Richard Montgomery High School 78 2.63
## 10 2021 Thomas Sprigg Wootton High Sch 33 0
## # … with 40 more rows
v1<- df_MCPS20D %>%
group_by(term_year,full_part) %>%
filter(full_part=="FT" & term_year=="2020")%>%
count(high_school) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = high_school, y = prop)) +
geom_col(aes(fill=high_school), position = "dodge") +
geom_text(aes(label = scales::percent(prop,0.5),
y = prop,
group = high_school),
position = position_dodge(width = 0.9),
vjust = 0, size=3, hjust=0)+
# facet_wrap(~term_year )+
ggtitle("High schools from which full time students in term year 2020 graduated")+
ylab('Proportion ')+
xlab("")+
theme(legend.position = "none", axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
v1+ coord_flip()
v1<- df_MCPS20D %>%
group_by(term_year,full_part) %>%
filter(full_part=="FT" & term_year=="2021")%>%
count(high_school) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = high_school, y = prop)) +
geom_col(aes(fill=high_school), position = "dodge") +
geom_text(aes(label = scales::percent(prop,0.5),
y = prop,
group = high_school),
position = position_dodge(width = 0.9),
vjust = 0, size=3, hjust=0)+
# facet_wrap(~term_year )+
ggtitle("High schools from which full time students in term year 2021 graduated")+
ylab('Proportion ')+
xlab("")+
theme(legend.position = "none", axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
v1+ coord_flip()
MCPS high schools attended by part time students in term year 2020
df_MCPS20D%>%
filter(full_part=="PT" & term_year=="2020")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(term_year)%>%
mutate(total_pop =sum(n))%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_pop= (n/total_pop*100))%>%
arrange(desc(pct_pop))
## # A tibble: 25 x 5
## # Groups: high_school [25]
## term_year high_school n total_pop pct_pop
## <chr> <chr> <int> <int> <dbl>
## 1 2020 Northwest HS - Germantown 61 801 7.62
## 2 2020 John F. Kennedy High School 56 801 6.99
## 3 2020 Gaithersburg High School 55 801 6.87
## 4 2020 Montgomery Blair High School 54 801 6.74
## 5 2020 Albert Einstein HS & MC Art Cn 44 801 5.49
## 6 2020 Clarksburg High School 43 801 5.37
## 7 2020 Paint Branch High School 41 801 5.12
## 8 2020 Richard Montgomery High School 38 801 4.74
## 9 2020 Watkins Mill High School 38 801 4.74
## 10 2020 Rockville High School 37 801 4.62
## # … with 15 more rows
MCPS high schools attended by part time students in term year 2021
df_MCPS20D%>%
filter(full_part=="PT" & term_year=="2021")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(term_year)%>%
mutate(total_pop =sum(n))%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_pop= (n/total_pop*100))%>%
arrange(desc(pct_pop))
## # A tibble: 25 x 5
## # Groups: high_school [25]
## term_year high_school n total_pop pct_pop
## <chr> <chr> <int> <int> <dbl>
## 1 2021 Gaithersburg High School 48 747 6.43
## 2 2021 Northwest HS - Germantown 48 747 6.43
## 3 2021 Montgomery Blair High School 39 747 5.22
## 4 2021 Colonel Zadok Magruder HS 38 747 5.09
## 5 2021 Northwood High School 38 747 5.09
## 6 2021 Paint Branch High School 37 747 4.95
## 7 2021 Quince Orchard Sr High School 35 747 4.69
## 8 2021 Walter Johnson High School 35 747 4.69
## 9 2021 Albert Einstein HS & MC Art Cn 34 747 4.55
## 10 2021 Richard Montgomery High School 33 747 4.42
## # … with 15 more rows
# calculate percentage change in part time student enrollment from 2020 to 2021 by MCPS high school
df_MCPS20D%>%
filter(full_part=="PT")%>%
group_by(term_year,high_school)%>%
count(high_school)%>%
group_by(high_school)%>%
arrange(term_year,.by_group=TRUE)%>%
mutate(pct_change= (n-lag(n))/lag(n)*100)%>%
arrange(desc(pct_change))
## # A tibble: 50 x 4
## # Groups: high_school [25]
## term_year high_school n pct_change
## <chr> <chr> <int> <dbl>
## 1 2021 Thomas Sprigg Wootton High Sch 24 100
## 2 2021 Walter Johnson High School 35 75
## 3 2021 Winston Churchill High School 15 66.7
## 4 2021 Poolesville Jr-Sr High School 14 55.6
## 5 2021 Northwood High School 38 52
## 6 2021 Sherwood High School 30 36.4
## 7 2021 Walt Whitman High School 14 27.3
## 8 2021 Colonel Zadok Magruder HS 38 15.2
## 9 2021 James Hubert Blake High School 30 11.1
## 10 2021 Damascus High School 19 5.56
## # … with 40 more rows
v3<- df_MCPS20D %>%
group_by(term_year,full_part) %>%
filter(full_part=="PT" & term_year=="2020")%>%
count(high_school) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = high_school, y = prop)) +
geom_col(aes(fill=high_school), position = "dodge") +
geom_text(aes(label = scales::percent(prop,0.5),
y = prop,
group = high_school),
position = position_dodge(width = 0.9),
vjust = 0, size=3, hjust=0)+
# facet_wrap(~term_year )+
ggtitle("High schools from which part time students in term year 2020 graduated")+
ylab('Proportion ')+
xlab("")+
theme(legend.position = "none", axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
v3+ coord_flip()
v4<- df_MCPS20D %>%
group_by(term_year,full_part) %>%
filter(full_part=="PT" & term_year=="2021")%>%
count(high_school) %>%
mutate(prop = n/sum(n)) %>%
ggplot(aes(x = high_school, y = prop)) +
geom_col(aes(fill=high_school), position = "dodge") +
geom_text(aes(label = scales::percent(prop,0.5),
y = prop,
group = high_school),
position = position_dodge(width = 0.9),
vjust = 0, size=3, hjust=0)+
# facet_wrap(~term_year )+
ggtitle("High schools from which part time students in term year 2021 graduated")+
ylab('Proportion ')+
xlab("")+
theme(legend.position = "none", axis.text.x=element_blank(),strip.background = element_blank(),panel.grid = element_blank())
v4 + coord_flip()
Boxplots of hours_attempted by year by MCPS students 20yrs and younger
p11 = ggplot(df_MCPS20D, aes(hours_attempted))
p11 + geom_boxplot(aes(colour = term_year)) +
facet_wrap(~full_part)
Students who register for more than 18 credits require special permission from the department. Furthermore, a full time student is classified as someone enrolled in 12 or more credits, and a part time student as someone enrolled in fewer than 12 credits. However, based on the dataset, a number of full time students attempt fewer than 12 credits and a large number of part time students attempt more than 12 hours.
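One way to quantify this mismatch is to cross-tabulate the declared status (full_part, which is self-reported at admission) against the status implied by the credits actually attempted. The sketch below runs on made-up rows in base R; the column status_by_load is a hypothetical helper, not a variable in the dataset.

```r
# Toy rows (hypothetical values, not from df_MCPS20D)
toy <- data.frame(
  full_part       = c("FT", "FT", "FT", "PT", "PT", "PT"),
  hours_attempted = c(15, 12, 9, 6, 11, 13)
)

# Status implied by the 12-credit threshold described above
toy$status_by_load <- ifelse(toy$hours_attempted >= 12, "FT", "PT")

# Rows = declared status, columns = status implied by credits attempted
mismatch <- table(declared = toy$full_part, by_load = toy$status_by_load)
mismatch
```

The off-diagonal cells count students whose attempted load does not match their declared status.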
Boxplots of hours_attempted by year by Full time MCPS students 20yrs and younger
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_attempted))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Boxplots of hours_attempted by year by Part time MCPS students 20yrs and younger
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_attempted))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
There are not many outliers in the part time student groups. Term year 2021 seems to have more outliers on the upper end.
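Reading outliers off a boxplot is approximate, so it can help to count them. The sketch below uses base R's boxplot.stats() on made-up credit loads. It applies the 1.5 x IQR convention that geom_boxplot() also uses by default, although the quartiles may be computed slightly differently, so counts could differ at the margin.

```r
# Toy credit loads (hypothetical values, not from df_MCPS20D)
hours <- c(3, 6, 6, 7, 8, 8, 9, 9, 10, 25)

# boxplot.stats() returns the hinges plus the points beyond 1.5 x IQR
stats <- boxplot.stats(hours)   # default coef = 1.5

# Values flagged as outliers, and how many there are
stats$out
length(stats$out)
```

Applied per term year and enrollment status, the same call would give a count to set beside the visual impression from the plots.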
Density plot of hours_attempted by year
ggplot(df_MCPS20D, aes(hours_attempted, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~full_part)+
xlab("Hours attempted") +
ylab( "Density")+
ggtitle(" Hours Attempted by Full-time Students vs Part-time Students")
Hours attempted by full time students
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_attempted, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours attempted") +
ylab( "Density") +
ggtitle(" Hours Attempted by Full-time Students")
Five-number summary of hours attempted by full time students
df_MCPS20D%>% filter(full_part=="FT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(hours_attempted)[1],
Q1 = fivenum(hours_attempted)[2],
median = fivenum(hours_attempted)[3],
Q3 = fivenum(hours_attempted)[4],
max = fivenum(hours_attempted)[5],
mean= mean(hours_attempted),
sd = sd(hours_attempted))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK N… 2020 5 6 13 17 19 36 18.2 11.1
## 2 Am. Indian / AK N… 2021 1 13 13 13 13 13 13 NA
## 3 Asian 2020 272 6 13 15 20 52 17.7 8.13
## 4 Asian 2021 227 7 13 15 17 46 16.8 7.00
## 5 Black / African A… 2020 389 5 12 13 14 42 13.7 3.53
## 6 Black / African A… 2021 326 4 12 14 16 38 14.9 4.31
## 7 Foreign 2020 103 7 12 14 17 31 14.8 4.26
## 8 Foreign 2021 96 7 12.5 15 16 37 15.7 5.29
## 9 Hawaiian / Pac. I… 2020 5 9 12 13 13 15 12.4 2.19
## 10 Hawaiian / Pac. I… 2021 3 12 15.5 19 24.5 30 20.3 9.07
## 11 Hispanic 2020 534 4 12 13 15 39 14.2 4.43
## 12 Hispanic 2021 596 3 12 14 16 43 15.0 4.33
## 13 Multi-Race 2020 71 6 12 13 17 44 16.7 8.04
## 14 Multi-Race 2021 63 6 12 14 16.5 43 15.7 6.22
## 15 Unknown 2020 11 9 12 14 15 31 15 5.78
## 16 Unknown 2021 3 12 12 12 13 14 12.7 1.15
## 17 White 2020 265 8 12 13 16 46 15.9 7.08
## 18 White 2021 241 7 13 14 17 54 16.5 6.37
Hours attempted by part time students
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_attempted, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours attempted") +
ylab( "Density")+
ggtitle(" Hours Attempted by Part-time Students")
Five-number summary of hours attempted by part time students
df_MCPS20D%>% filter(full_part=="PT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(hours_attempted)[1],
Q1 = fivenum(hours_attempted)[2],
median = fivenum(hours_attempted)[3],
Q3 = fivenum(hours_attempted)[4],
max = fivenum(hours_attempted)[5],
mean= mean(hours_attempted),
sd = sd(hours_attempted))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK N… 2020 4 3 3 5.5 8.5 9 5.75 3.20
## 2 Am. Indian / AK N… 2021 1 6 6 6 6 6 6 NA
## 3 Asian 2020 69 2 6 9 10 33 8.62 4.94
## 4 Asian 2021 63 3 7.5 9 11 21 8.90 3.64
## 5 Black / African A… 2020 177 1 6 7 9 15 7.28 2.62
## 6 Black / African A… 2021 181 1 6 8 10 25 7.80 3.37
## 7 Foreign 2020 73 3 6 8 10 23 8.18 3.89
## 8 Foreign 2021 54 3 5 9 10 29 8.61 4.38
## 9 Hawaiian / Pac. I… 2020 1 6 6 6 6 6 6 NA
## 10 Hawaiian / Pac. I… 2021 1 5 5 5 5 5 5 NA
## 11 Hispanic 2020 327 1 6 8 9 21 7.84 3.06
## 12 Hispanic 2021 263 1 6 8 11 42 8.73 4.41
## 13 Multi-Race 2020 33 1 4 8 9 12 7.03 2.98
## 14 Multi-Race 2021 35 3 6 9 10 26 8.34 3.90
## 15 Unknown 2020 5 7 9 10 10 10 9.2 1.30
## 16 Unknown 2021 2 4 4 6.5 9 9 6.5 3.54
## 17 White 2020 112 1 6 8 10 33 8.15 4.47
## 18 White 2021 147 3 5 8 10 39 8.43 5.02
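A note on reading Q1 and Q3 in these tables: fivenum() reports Tukey's hinges, which can differ from the quartiles returned by quantile() when a group is small. The sketch below uses a hypothetical vector chosen only because it is consistent with the n = 4 row above (min 3, Q1 3, median 5.5, Q3 8.5, max 9); the real values are not known to me. It also shows why sd is NA for groups with a single student.

```r
# Hypothetical values consistent with the n = 4 summary row above
x <- c(3, 3, 8, 9)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
hinges <- fivenum(x)
hinges

# Default quantile() quartiles (type 7) for comparison
q <- quantile(x, c(0.25, 0.75))
q

# The standard deviation of a single observation is undefined
sd_single <- sd(6)
sd_single
```

For this vector the upper hinge is 8.5 while the type 7 third quartile is 8.25, so for very small groups the two definitions should not be mixed.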
Boxplots of Hours Earned by year by MCPS students 20yrs and younger
p11 = ggplot(df_MCPS20D, aes(hours_earned))
p11 + geom_boxplot(aes(colour = term_year)) +
facet_wrap(~full_part)
Boxplots of hours_earned by year by Full time MCPS students 20yrs and younger
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Boxplots of hours_earned by year by Part time MCPS students 20yrs and younger
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
The part-time student groups show few outliers. Term year 2021 appears to have more outliers at the upper end than term year 2020.
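The visual impression of outliers can be checked numerically. The sketch below counts values above the upper whisker limit that geom_boxplot() draws (Q3 + 1.5 * IQR). The helper name `count_upper_outliers` and the toy data are assumptions made for illustration; they are not part of the original analysis.

```r
# Count points above the upper whisker limit used by geom_boxplot():
# Q3 + 1.5 * IQR, with quartiles from quantile()'s default method.
count_upper_outliers <- function(x) {
  x <- x[!is.na(x)]
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  upper_fence <- q[2] + 1.5 * (q[2] - q[1])
  sum(x > upper_fence)
}

# Hypothetical toy data, not taken from the study.
toy <- data.frame(
  term_year    = rep(c("2020", "2021"), each = 6),
  hours_earned = c(0, 3, 3, 4, 6, 6,
                   0, 0, 3, 3, 6, 30)
)

# One count per term year.
upper_outliers <- tapply(toy$hours_earned, toy$term_year, count_upper_outliers)
print(upper_outliers)
```

With the report's data, the same helper could be called inside summarise() after group_by(race, term_year) on the part-time subset, for example `summarise(upper_outliers = count_upper_outliers(hours_earned))`.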
Density plot of hours_earned by year
ggplot(df_MCPS20D, aes(hours_earned, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~full_part)+
xlab("Hours Earned") +
ylab( "Density")+
ggtitle(" Hours Earned by Full-time vs Part-time Students")
Density plot of hours_earned for full-time students
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours Earned") +
ylab( "Density")+
ggtitle(" Hours Earned by Full-time Students")
Five-number summary of hours_earned for full-time students
df_MCPS20D%>% filter(full_part=="FT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(hours_earned)[1],
Q1 = fivenum(hours_earned)[2],
median = fivenum(hours_earned)[3],
Q3 = fivenum(hours_earned)[4],
max = fivenum(hours_earned)[5],
mean= mean(hours_earned),
sd = sd(hours_earned))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK N… 2020 5 0 10 14 19 36 15.8 13.3
## 2 Am. Indian / AK N… 2021 1 13 13 13 13 13 13 NA
## 3 Asian 2020 272 0 9 13 17 52 14.7 9.15
## 4 Asian 2021 227 0 9 12 16 46 13.4 8.53
## 5 Black / African A… 2020 389 0 6 9 12 42 8.85 5.58
## 6 Black / African A… 2021 326 0 6 9 13 37 9.55 6.53
## 7 Foreign 2020 103 0 6 9 13 31 10.4 6.44
## 8 Foreign 2021 96 0 6 10 13 37 10.6 7.56
## 9 Hawaiian / Pac. I… 2020 5 0 0 9 12 13 6.8 6.38
## 10 Hawaiian / Pac. I… 2021 3 9 12.5 16 23 30 18.3 10.7
## 11 Hispanic 2020 534 0 6 9 12 38 9.57 6.50
## 12 Hispanic 2021 596 0 6 10 13 33 9.73 6.36
## 13 Multi-Race 2020 71 0 7 12 15 44 13.4 9.76
## 14 Multi-Race 2021 63 0 6 10 13.5 43 10.8 8.69
## 15 Unknown 2020 11 3 5 9 13 31 10.5 7.90
## 16 Unknown 2021 3 3 5 7 9.5 12 7.33 4.51
## 17 White 2020 265 0 7 11 15 46 12.3 8.88
## 18 White 2021 241 0 7 12 15 54 12.5 8.22
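A note on reading the Q1 and Q3 columns next to the boxplots: fivenum() returns Tukey's hinges, while geom_boxplot() draws its box from the 25th and 75th percentiles as computed by quantile()'s default method. The two can differ slightly, mostly in small groups. The sketch below shows the difference on a hypothetical four-value vector that is not taken from the study.

```r
# Hypothetical values chosen so the two definitions disagree.
x <- c(1, 2, 3, 4)

# Tukey's hinges: elements 2 and 4 of fivenum().
hinges <- fivenum(x)[c(2, 4)]

# Quartiles from quantile()'s default method (type 7).
quartiles <- quantile(x, c(0.25, 0.75), names = FALSE)

print(hinges)     # 1.5 3.5
print(quartiles)  # 1.75 3.25
```

For the larger race groups in this report the two definitions should give nearly identical values, so the tables and boxplots can be read together.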
Density plot of hours_earned for part-time students
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours Earned") +
ylab( "Density")+
ggtitle(" Hours Earned by Part-time Students")
Five-number summary of hours_earned for part-time students
df_MCPS20D%>% filter(full_part=="PT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(hours_earned)[1],
Q1 = fivenum(hours_earned)[2],
median = fivenum(hours_earned)[3],
Q3 = fivenum(hours_earned)[4],
max = fivenum(hours_earned)[5],
mean= mean(hours_earned),
sd = sd(hours_earned))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK … 2020 4 0 1.5 3 3 3 2.25 1.5
## 2 Am. Indian / AK … 2021 1 3 3 3 3 3 3 NA
## 3 Asian 2020 69 0 0 4 6 33 4.81 5.50
## 4 Asian 2021 63 0 3 3 6 21 4.73 4.48
## 5 Black / African … 2020 177 0 0 3 4 11 2.73 2.82
## 6 Black / African … 2021 181 0 0 1 6 22 2.81 3.72
## 7 Foreign 2020 73 0 0 3 6 21 3.96 4.60
## 8 Foreign 2021 54 0 0 0 6 29 3.09 5.04
## 9 Hawaiian / Pac. … 2020 1 0 0 0 0 0 0 NA
## 10 Hawaiian / Pac. … 2021 1 3 3 3 3 3 3 NA
## 11 Hispanic 2020 327 0 0 3 6 21 3.48 3.84
## 12 Hispanic 2021 263 0 0 3 6 42 4.65 5.14
## 13 Multi-Race 2020 33 0 1 3 9 11 4.27 3.83
## 14 Multi-Race 2021 35 0 0 3 6 26 4.11 4.95
## 15 Unknown 2020 5 0 1 1 4 9 3 3.67
## 16 Unknown 2021 2 3 3 3.5 4 4 3.5 0.707
## 17 White 2020 112 0 0 4 7 27 4.74 4.87
## 18 White 2021 147 0 3 4 7 33 5.16 5.14
Boxplots of GPA by term year for MCPS students aged 20 and younger
p11 = ggplot(df_MCPS20D, aes(mc_gpa))
p11 + geom_boxplot(aes(colour = term_year)) +
facet_wrap(~full_part)
Boxplots of GPA by term year for full-time MCPS students aged 20 and younger
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(mc_gpa))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Boxplots of GPA by term year for part-time MCPS students aged 20 and younger
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(mc_gpa))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Density plot of GPA by year
ggplot(df_MCPS20D, aes(mc_gpa, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~full_part)+
xlab("GPA") +
ylab( "Density")+
ggtitle(" GPA by Full-time vs Part-time Students")
Density plot of GPA for full-time students
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(mc_gpa, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("GPA") +
ylab( "Density")+
ggtitle(" GPA of Full-time Students")
Five-number summary of GPA for full-time students
df_MCPS20D%>% filter(full_part=="FT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(mc_gpa)[1],
Q1 = fivenum(mc_gpa)[2],
median = fivenum(mc_gpa)[3],
Q3 = fivenum(mc_gpa)[4],
max = fivenum(mc_gpa)[5],
mean= mean(mc_gpa),
sd = sd(mc_gpa))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK … 2020 5 0 2.35 2.9 3.5 4 2.55 1.55
## 2 Am. Indian / AK … 2021 1 2.77 2.77 2.77 2.77 2.77 2.77 NA
## 3 Asian 2020 272 0 2.33 3.3 3.73 4 2.93 1.03
## 4 Asian 2021 227 0 2.5 3.23 3.71 4 2.88 1.12
## 5 Black / African … 2020 389 0 1.5 2.5 3.14 4 2.25 1.18
## 6 Black / African … 2021 326 0 1.33 2.67 3.4 4 2.31 1.30
## 7 Foreign 2020 103 0 2 3 3.65 4 2.71 1.20
## 8 Foreign 2021 96 0 1.46 2.82 3.69 4 2.48 1.35
## 9 Hawaiian / Pac. … 2020 5 0 0 2.25 2.67 3.77 1.74 1.68
## 10 Hawaiian / Pac. … 2021 3 1.75 2.22 2.68 3.34 4 2.81 1.13
## 11 Hispanic 2020 534 0 1.5 2.70 3.44 4 2.38 1.25
## 12 Hispanic 2021 596 0 1.23 2.66 3.33 4 2.29 1.30
## 13 Multi-Race 2020 71 0 2 2.75 3.5 4 2.59 1.13
## 14 Multi-Race 2021 63 0 1.5 2.6 3.54 4 2.37 1.35
## 15 Unknown 2020 11 0.33 2.12 2.33 3.32 4 2.55 1.00
## 16 Unknown 2021 3 2.55 2.65 2.75 3.38 4 3.1 0.786
## 17 White 2020 265 0 1.8 3 3.6 4 2.59 1.22
## 18 White 2021 241 0 2 3 3.69 4 2.67 1.26
Density plot of GPA for part-time students
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(mc_gpa, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("GPA") +
ylab( "Density")+
ggtitle(" GPA of Part-time Students")
Five-number summary of GPA for part-time students
df_MCPS20D%>% filter(full_part=="PT")%>%
group_by(race,term_year)%>%
summarise(n = n(),
min = fivenum(mc_gpa)[1],
Q1 = fivenum(mc_gpa)[2],
median = fivenum(mc_gpa)[3],
Q3 = fivenum(mc_gpa)[4],
max = fivenum(mc_gpa)[5],
mean= mean(mc_gpa),
sd = sd(mc_gpa))
## `summarise()` has grouped output by 'race'. You can override using the `.groups` argument.
## # A tibble: 18 x 10
## # Groups: race [9]
## race term_year n min Q1 median Q3 max mean sd
## <chr> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Am. Indian / AK … 2020 4 0 0.5 1.25 2.25 3 1.38 1.25
## 2 Am. Indian / AK … 2021 1 2 2 2 2 2 2 NA
## 3 Asian 2020 69 0 0 2.3 3.33 4 2.01 1.54
## 4 Asian 2021 63 0 0.8 2 3.28 4 1.94 1.48
## 5 Black / African … 2020 177 0 0 1.33 2.71 4 1.46 1.38
## 6 Black / African … 2021 181 0 0 0.33 2.33 4 1.13 1.32
## 7 Foreign 2020 73 0 0 2 3 4 1.65 1.51
## 8 Foreign 2021 54 0 0 0 2.67 4 1.20 1.46
## 9 Hawaiian / Pac. … 2020 1 0 0 0 0 0 0 NA
## 10 Hawaiian / Pac. … 2021 1 4 4 4 4 4 4 NA
## 11 Hispanic 2020 327 0 0 1.5 3 4 1.60 1.50
## 12 Hispanic 2021 263 0 0 2 3 4 1.73 1.43
## 13 Multi-Race 2020 33 0 0.67 2 3.5 4 1.99 1.51
## 14 Multi-Race 2021 35 0 0 2.5 3 4 1.80 1.55
## 15 Unknown 2020 5 0 0.75 2 3.67 4 2.08 1.75
## 16 Unknown 2021 2 3 3 3.5 4 4 3.5 0.707
## 17 White 2020 112 0 0 2 3.33 4 1.86 1.54
## 18 White 2021 147 0 0.55 2.5 3.33 4 2.16 1.49
# Hours Earned Rate
Density plot of Hours Earned Rate by year
ggplot(df_MCPS20D, aes(hours_earned_rate, fill = term_year)) + geom_density(alpha = 0.3) +
facet_wrap(~full_part)+
xlab("Hours Earned Rate") +
ylab( "Density")+
xlim(0,1)
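Because xlim(0, 1) removes any observation outside that range from the plot, it can help to count how many rates fall outside it or are undefined before reading the densities. The sketch below is one way to do that. The helper name `rate_range_check` and the toy values are assumptions for illustration only.

```r
# Summarise where hours_earned / hours_attempted falls relative to [0, 1].
rate_range_check <- function(earned, attempted) {
  rate <- earned / attempted
  c(
    n           = length(rate),
    undefined   = sum(!is.finite(rate)),                   # e.g. 0 / 0
    above_one   = sum(is.finite(rate) & rate > 1),         # earned exceeds attempted
    within_unit = sum(is.finite(rate) & rate >= 0 & rate <= 1)
  )
}

# Hypothetical toy values, not taken from the study.
check <- rate_range_check(
  earned    = c(12, 9, 0, 15, 0),
  attempted = c(12, 12, 6, 12, 0)
)
print(check)
```

Rates above 1 are possible in principle because hours_earned can include credits, such as AP credits, that are not counted in hours_attempted.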
Boxplots of hours_earned_rate by term year for full-time MCPS students aged 20 and younger
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned_rate))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Boxplots of hours_earned_rate by term year for part-time MCPS students aged 20 and younger
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned_rate))+
geom_boxplot(aes(colour = term_year)) +
facet_wrap(~race)
Density plot of hours_earned_rate for full-time students
df_MCPS20D%>%filter(full_part=="FT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned_rate, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours Earned Rate") +
ylab( "Density")+
ggtitle(" Hours Earned Rate of Full-time Students")
Density plot of hours_earned_rate for part-time students
df_MCPS20D%>%filter(full_part=="PT")%>%
filter(race=="White" |race=="Asian" |race=="Hispanic" |race=="Black / African Am." )%>%
ggplot(., aes(hours_earned_rate, fill = term_year)) + geom_density(alpha = 0.2) +
facet_wrap(~race)+
xlab("Hours Earned Rate") +
ylab( "Density")+
ggtitle(" Hours Earned Rate of Part-time Students")
Distribution of variables and correlation: full-time students, term year 2020
library(GGally)
# plot distributions and correlation of variables
df_MCPS20D%>% filter(term_year=="2020")%>%
filter(full_part=="FT")%>%
ggpairs(., columns = c("hours_attempted","hours_earned", "mc_gpa","hours_earned_rate"))
Distribution of variables and correlation: full-time students, term year 2021
library(GGally)
# plot distributions and correlation of variables
df_MCPS20D%>% filter(term_year=="2021")%>%
filter(full_part=="FT")%>%
ggpairs(., columns = c("hours_attempted","hours_earned", "mc_gpa","hours_earned_rate"))
Distribution of variables and correlation: part-time students, term year 2020
library(GGally)
# plot distributions and correlation of variables
df_MCPS20D%>% filter(term_year=="2020")%>%
filter(full_part=="PT")%>%
ggpairs(., columns = c("hours_attempted","hours_earned", "mc_gpa","hours_earned_rate"))
Distribution of variables and correlation: part-time students, term year 2021
library(GGally)
# plot distributions and correlation of variables
df_MCPS20D%>% filter(term_year=="2021")%>%
filter(full_part=="PT")%>%
ggpairs(., columns = c("hours_attempted","hours_earned", "mc_gpa","hours_earned_rate"))
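The correlation coefficients shown in the ggpairs() panels can also be tabulated as a plain matrix, which is easier to compare across the four subgroups. The sketch below uses base R's cor() with Pearson correlation on a hypothetical toy data frame. The values are invented for illustration and are not results from this study.

```r
# Hypothetical toy data with the same column names as the report.
toy <- data.frame(
  hours_attempted = c(12, 13, 15, 12, 16, 14),
  hours_earned    = c(12, 10, 15, 6, 16, 14),
  mc_gpa          = c(3.2, 2.5, 3.8, 1.9, 3.6, 3.0)
)
toy$hours_earned_rate <- toy$hours_earned / toy$hours_attempted

vars <- c("hours_attempted", "hours_earned", "mc_gpa", "hours_earned_rate")

# Pearson correlation matrix, rounded for display.
# use = "pairwise.complete.obs" skips missing values pair by pair.
cor_mat <- round(cor(toy[, vars], method = "pearson",
                     use = "pairwise.complete.obs"), 2)
print(cor_mat)
```

Applied to the report's data, the same call could be run on each filtered subset (for example full-time students in term year 2020) in place of `toy`.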